Abstract

When it comes to clinical survival trials, regulatory restrictions usually require the application
of methods that solely utilize baseline covariates and the intention-to-treat principle. Thereby a
lot of potentially useful information is lost, as collection of time-to-event data often goes hand
in hand with collection of information on biomarkers and other internal time-dependent covariates.
However, there are tools to incorporate information from repeated measurements in a useful
manner that can help to shed more light on the underlying treatment mechanisms. We consider
dynamic path analysis, a model for mediation analysis in the presence of a time-to-event outcome
and time-dependent covariates, to investigate direct and indirect effects in a study of different
lipid lowering treatments in patients with previous myocardial infarctions. Further, we address
the question whether survival in itself may produce associations between the treatment and the
mediator in dynamic path analysis and give an argument that, due to linearity of the assumed
additive hazard model, this is not the case. We further elaborate on our view that, when studying
mediation, we are actually dealing with underlying processes rather than single variables measured
only once during the study period. This becomes apparent in results from various models
applied to the study of lipid lowering treatments as well as in our additionally conducted simulation
study, where we clearly observe that discarding information on repeated measurements can lead
to potentially erroneous conclusions.


1 Introduction 


Survival and event history analysis has become a central interest in clinical biostatistics, as many clinical
trials are designed to investigate the effect of certain treatments on the time until the occurrence of
a particular event of interest, such as death or disease progression. As required by regulatory authorities,
typical approaches to assess these effects comprise Kaplan-Meier plots along with log-rank tests
or a Cox regression model including treatment and baseline covariates. These analysis
strategies mainly focus on answering the rather pragmatic question 'Does treatment work?'.

Often collection of time-to-event data goes hand in hand with the collection of information on biomarkers
and other internal time-dependent covariates, which are hardly ever used in the final analysis.
Thereby a lot of useful information is ignored that could be used to address the more exploratory
question 'How does treatment work?'. A key tool for approaching answers to that question is mediation
analysis, which allows for a decomposition of the total treatment effect into a direct effect and an
indirect effect. Emsley and co-authors (Emsley et al. (2010)) give a historical overview of approaches
to mediation analysis and thereby discuss two traditions that appear distinct at first glance. On the
one hand, there is the estimation of direct and indirect effects by the method of path analysis together with
structural equation modeling (Wright (1934); Duncan (1966); Goldberger (1972); Baron and Kenny
(1986)), mainly motivated by the social and behavioural sciences. On the other hand, there is the 'causal inference
approach', mainly developed by statisticians and econometricians, focusing on the assumptions needed for
the identification of direct and indirect effects to draw valid inference (Rubin (1974); Holland (1988);
Robins and Greenland (1992); Hafeman and VanderWeele (2011); Cole and Hernán (2002)), and also pointing
out the problems that occur when path-tracing rules and estimation techniques from linear
models are transferred to non-linear models (Kaufman et al. (2004); Pearl (2012)).


Besides other obvious connections between these two traditions, one common issue was that, until the
last decade, there was no straightforward approach for handling the estimation of direct and indirect
effects for survival outcomes. In 2006, Fosen and co-authors (Fosen et al. (2006b)) proposed the
model of dynamic path analysis, which we will mainly be concerned with in this paper, based on linear
regression and the additive hazard model (Aalen (1980, 1989)). The approach was originally motivated
by recurrent event modeling and would, following the categorisation by Emsley and co-authors, fall
into the tradition of path analysis. Lange and Hansen (Lange and Hansen (2011)) developed a method
from a causal inference perspective and presented a way to obtain natural direct and indirect effects.
They make use of the additive hazard model as well; however, their approach is restricted to the setting
of a time-fixed, normally distributed mediator.

The model of dynamic path analysis enables us to exploit more information from data routinely collected
within clinical trials (or observational studies, as illustrated in Røysland et al. (2011); Gamborg et al.
(2011)) by utilizing repeated measurements of the mediator. The model is based on the idea that we
are actually dealing with continuous processes that evolve over time rather than fixed variables, as
pointed out in Aalen et al. (2014), and aims at modeling the effects of several covariate processes on
the occurrence of the event of interest and the relation between the covariate processes. It can be
viewed as an extension of classical path analysis (Wright (1934)) and the concept of directed acyclic
graphs (DAGs), an important tool in causal inference, to settings that involve time-to-event outcomes
and time-dependent covariates. The extensions are essentially that a DAG is defined at each event
time and therefore forms a stochastic process in itself, that the path coefficients may change over time
and that the outcome is the occurrence of an event. However, making use of the additive hazard model
for the time-to-event outcome and linear regression for the treatment-mediator relationship preserves
the rules for multiplying coefficients along paths, which would not be meaningful in non-linear models
such as the Cox model (Kaufman et al. (2004); VanderWeele (2011)).

Martinussen (Martinussen (2010)) derived the large sample properties of the dynamic path analysis
model and further states that, under the additional assumptions that the treatment-outcome relation
and the mediator-outcome relation are unconfounded and that there are no interactions between the
treatment and the mediator, the obtained estimates can be interpreted as truly causal effects. He
emphasises, however, that no unique definitions of the concepts of direct, indirect and total
effects exist (Pearl (2001); Robins and Greenland (1992); Goetgeluk and Vansteelandt (2009)).

Further considerations within the field of causal inference motivated by the model have mainly addressed
various scenarios where measured or unmeasured confounders could occur on the pathway
between the mediator and the outcome, which could themselves be affected by the exposure. For
example, Martinussen et al. (2011) propose a two-stage estimator for the controlled direct effect of
a point exposure on a survival outcome under one particular confounding situation. However, that
approach is limited to a time-fixed intermediate variable as well.

In this paper, we will assume that the treatment-outcome as well as the mediator-outcome relationships
are unconfounded and that no treatment-mediator interactions are present, and rather address
the question whether survival in itself may produce associations between the treatment and the mediator
in dynamic path analysis. We give an argument showing that, due to linearity of the additive
hazard model, this is not the case, so selection by survival does not produce artificial association. This
implies the important property that, if one finds the treatment-mediator effect to be changing
over time, this is due to a real change of the effect and not caused by the survival selection mechanism.
A further important by-product of our argument is that covariates that affect survival in an additive
manner but are not considered confounders will not turn into confounders as time passes.

For ease of presentation we present the underlying derivations for a single time point in the main
text. An appendix contains a generalisation of that argument, which involves a mathematically
more precise, and therefore perhaps more complicated-looking, notation.

To elaborate further on our view that we are actually dealing with processes, we performed a simulation
study comparing dynamic path analysis when the mediator is only measured at one point in time
with dynamic path analysis utilizing several measurements, in a setting where we actually
assume the mediator to be a time-varying process. We can clearly observe that discarding information
tends to obscure potential inference about the underlying mechanisms. Furthermore, we have data
available that were collected within the IDEAL study project (Incremental Decrease in End Points
Through Aggressive Lipid Lowering, Pedersen et al. (2005)), a multi-center clinical trial designed to
compare the effects of two different lipid lowering strategies on the risk of cardiovascular disease among
patients with established coronary heart disease (CHD). Enrolled patients were asked to repeatedly
return to the study center at pre-scheduled time points for monitoring their lipid values along with
other laboratory measurements. The trial was designed based on the mechanistic understanding that
statins would lower low density lipoprotein (LDL)-cholesterol levels, which would, in turn, result in a
reduced risk of coronary events. However, the trial did not show the expected results regarding the
primary outcome. Among other more in-depth considerations, the researchers also investigated the
proportion of the treatment effect mediated through different lipid measures (Boekholdt et al. (2012)),
but the applied methods were based on utilizing only a single measurement at a particular time point
(Simes et al. (2002); Freedman et al. (1992)). We employ dynamic path analysis to illustrate how the
direct and the indirect effects develop over time and also use a more broadly defined outcome, to utilize
more of the collected information.

The paper is structured as follows. First, we describe the concept of dynamic path analysis and the
respective estimation equations in Section 2, followed by our argument concerning conditioning on
survival in Section 3. In Section 4 we present results from our simulation study, followed by the results
from the application to the IDEAL data in Section 5 and a discussion. In an appendix we present the
generalisation of the argument explained in Section 3 to various time points and motivate the causal
interpretation of the direct and indirect effects described in Section 2.


2 Dynamic path analysis 

As mentioned above, the idea behind dynamic path analysis is to define a series of graphical models
that depict the relationship between a time-fixed treatment (further baseline covariates could be added
without changing the overall picture), a covariate process and the occurrence of the event of interest,
modelled as the infinitesimal change of a counting process. More specifically, we consider the situation
illustrated in Figure 1, where $X_1$ would represent the statin treatment at randomisation in our application,
$X_2(t-)$ stands for the time-dependent mediator, e.g. LDL-cholesterol, and $dN(t)$ essentially
refers to a jump in the counting process, which would then be 'any CHD event'.


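[Figure 1: DAG with nodes $X_1$, $X_2(t-)$ and $dN(t)$; the arrows into $dN(t)$ are labelled $\beta_{3,1}(t)dt$ (from $X_1$) and $\beta_{3,2}(t)dt$ (from $X_2(t-)$).]

Figure 1: Illustration of one time-local DAG.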


To perform a dynamic path analysis one has to regress each node in the diagram onto its parents at
each event time. For the mediator-exposure relationship this can be done by ordinary linear regression,
whereas for the counting process increment we make use of the additive hazard model (Aalen (1980,
1989)).


Let us be more specific. Let $N = \{N(t);\, t \in [0, \tau]\}$ denote a right-continuous counting process for
one particular individual. Given that we are considering survival data, $N(t)$ will start at 0 and jump
to 1 if the individual experiences the event of interest. We want to model the 'jump intensity' within
a small time interval, given all information that has been observed prior to that interval, which we
heuristically denote as 'past'. Let $dN(t)$ denote the increment of the counting process $N(t)$ over a
small interval $[t, t + dt)$; then we can express the intensity process $\lambda(t)$ as

$$\lambda(t)dt = P(dN(t) = 1 \mid \text{past}),$$

which we can alternatively express as $E[dN(t) \mid \text{past}]$, since $dN(t)$ is a binary variable. Applying
well-known results from counting process theory (Andersen et al. (1993); Aalen et al. (2008)), we have the
following decomposition

$$dN(t) = \lambda(t)dt + dM(t),$$

where $dM(t)$ is a martingale increment. Note that the decomposition is of the form 'observation =
signal + noise' with $dN(t)$ interpreted as the observation, $\lambda(t)dt$ as the signal and $dM(t)$ as the noise.
We further assume independent censoring (Aalen et al. (2008), Chapter 2) and introduce an 'at risk
indicator' $Y(t)$ at time $t$.

As stated above, for dynamic path analysis we want to make use of the additive hazard model to
describe the relationship between the change in the counting process and the covariate processes,
which takes the following form

$$dN(t) = Y(t)\{\beta_{3,0}(t) + \beta_{3,1}(t)X_1 + \beta_{3,2}(t)X_2(t-) + \beta_{3,3}(t)Z_1 + \cdots + \beta_{3,p+2}(t)Z_p\}dt + dM(t). \tag{1}$$

Here the first term on the right-hand side describes the intensity process as the product of the 'at
risk indicator', $Y(t)$, and a hazard function, $\alpha(t)$, corresponding to the term in curly brackets. Further,
the $\beta_{3,j}(t)$ are arbitrary regression functions, $X_1$ denotes the time-fixed treatment, $X_2(t-)$ the
time-dependent mediator and the vector $Z$ relevant baseline covariates. For the treatment-mediator
relationship we assume a linear relationship

$$X_2(t) = b_{2,0}(t) + b_{2,1}(t)X_1 + b_{2,2}(t)Z_1 + \cdots + b_{2,p+1}(t)Z_p + W_2(t), \tag{2}$$

where in this case the $b_{2,j}(t)$ denote the respective regression coefficients at time $t$ and $W_2(t)$ the error
term at time $t$. In applications, of course, different baseline covariates can be incorporated in (1) and
(2) depending on the setting, but we will use the index $p$ throughout for the sake of simplicity. Further
we also assume

$$E[X_2(t) \mid X_1, Z, T > t] = b_{2,0}(t) + b_{2,1}(t)X_1 + b_{2,2}(t)Z_1 + \cdots + b_{2,p+1}(t)Z_p,$$

which will be further discussed in Section 3 and the appendix. For consistency with later sections, we
assume the following structure for $X_1$

$$X_1 = b_{1,0} + W_1, \tag{3}$$

with the error term $W_1$ being independent of the error terms $W_2(t)$ at all times. Note that the case of
a binary treatment, as e.g. in our application section, is also covered by that formulation.

Use of the additive hazards model is a key ingredient in our approach. As described in Section 2.1,
estimation in the additive hazards model focuses on the cumulative regression functions; see also Section
4.2 in the book by Aalen et al. (Aalen et al. (2008)). For this reason we will (as in Røysland et al. (2011))
define the cumulative direct and indirect effects from our structural equation models in equations (1)-(3)
as

$$\mathrm{dir}(X_1 \to N(t)) = \int_0^t \beta_{3,1}(s)\,ds \tag{4}$$

and

$$\mathrm{indir}(X_1 \to X_2(t) \to N(t)) = \int_0^t b_{2,1}(s)\beta_{3,2}(s)\,ds. \tag{5}$$

It should be noted, however, that our interest is in the local indirect and direct effects. So when
interpreting the cumulative effect estimates, we focus on their slopes. For more detailed elaborations
see Fosen et al. (2006b), where it is also shown that the total cumulative effect can be obtained by
simply taking the sum of the cumulative direct and indirect effects.
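The additivity of the direct and indirect effects can be made plausible by a heuristic substitution. The following sketch is our own illustration under simplifying assumptions: baseline covariates $Z$ are omitted and the distinction between $X_2(t)$ and $X_2(t-)$ is ignored. Inserting the mediator model (2) into the hazard in (1) gives

```latex
\begin{align*}
\alpha(t) &= \beta_{3,0}(t) + \beta_{3,1}(t)X_1
             + \beta_{3,2}(t)\{b_{2,0}(t) + b_{2,1}(t)X_1 + W_2(t)\} \\
          &= \{\beta_{3,0}(t) + \beta_{3,2}(t)b_{2,0}(t)\}
             + \{\beta_{3,1}(t) + b_{2,1}(t)\beta_{3,2}(t)\}X_1
             + \beta_{3,2}(t)W_2(t).
\end{align*}
```

The coefficient of $X_1$ is the sum of the local direct effect $\beta_{3,1}(t)$ and the local indirect effect $b_{2,1}(t)\beta_{3,2}(t)$, that is, the product of the coefficients along the path through the mediator. Integrating this coefficient from 0 to $t$ yields the sum of (4) and (5).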

In our appendix we motivate a causal interpretation of the time-local direct and indirect effects;
integration then gives the respective cumulative effects. Heuristically speaking, the indirect effect
reflects what would happen if we could manipulate the mediator $X_2(t)$ such that it behaved as if $X_1$
had been perturbed, while we let $X_1$ remain unperturbed. This idea resembles the definition of a
natural indirect effect, but without making explicit use of the counterfactual framework. The idea
underlying the direct effect is an intervention where $X_1$ is perturbed, but the effect that perturbation
would have had on $X_2(t)$ can be attenuated. For a more general and mathematically more precise
description of these concepts we refer to Sections B.2 to B.4 in the appendix.


2.1 Parameter Estimation 


So far we have defined the cumulative direct and indirect effects in equations (4) and (5); now we briefly
recapitulate the estimation equations according to Røysland et al. (2011). Let us assume we
have $I$ independent observations of the above defined quantities, with individual values denoted as
$N_i(t), Y_i(t), X_{1i}, X_{2i}(t), Z_{1i}, \dots, Z_{pi}, M_i(t)$ for $i = 1, \dots, I$ at time $t$. Let further $L(t)$ be the matrix
with the $i$-th row being equal to $Y_i(t)(1, X_{1i}, X_{2i}(t-), Z_{1i}, \dots, Z_{pi})$, and $N(t)$ and $dM(t)$ the vectors
consisting of the respective individual numbers of events $N_i(t)$ and martingale increments $dM_i(t)$, as
well as $dB(t) = (\beta_{3,0}(t)dt, \beta_{3,1}(t)dt, \dots, \beta_{3,p+2}(t)dt)^T$ the vector of regression functions.

Then we can write (1) in matrix notation

$$dN(t) = L(t)dB(t) + dM(t),$$

which suggests that, given that $L(t)$ has full rank, an estimator for the local effects $dB(t)$, denoted as
$d\hat{B}(t)$, can be obtained by solving the estimation equations

$$L(t)^T dN(t) - L(t)^T L(t)\, d\hat{B}(t) = 0$$

for every event time. The estimates $d\hat{B}(t)$ of the local effects may be quite variable, but we obtain
stable estimates of the cumulative regression functions $B(t)$ by aggregating the local effects over all
event times up to and including time $t$.

For the mediator process $X_2(t)$ we assumed a linear model

$$X_2(t) = \tilde{L}(t)b_2(t) + W_2(t),$$

where the $i$-th row of $\tilde{L}(t)$ equals $Y_i(t)(1, X_{1i}, Z_{1i}, \dots, Z_{pi})$ and $W_2(t)$ has zero mean, is independent
of $\tilde{L}(t)$ and has uncorrelated components.

So the estimates $\hat{b}_2(t)$ can be deduced from the standard normal equations

$$\tilde{L}(t)^T X_2(t) - \tilde{L}(t)^T \tilde{L}(t)\hat{b}_2(t) = 0$$

at each event time. Now we can insert the obtained local estimates $d\hat{B}_{3,1}(t)$ and $d\hat{B}_{3,2}(t)$ for $\beta_{3,1}(t)dt$
and $\beta_{3,2}(t)dt$ and the respective OLS estimate for $b_{2,1}(t)$ in equations (4) and (5). Then the estimates
of the direct and indirect cumulative effects become sums over the observed event times. For more
details see Fosen et al. (2006b).
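The estimation steps above can be sketched in code. The following is a minimal illustration in plain Python, not the implementation used for this paper (the analyses reported here were run in R). It omits the baseline covariates $Z$, assumes that each individual's mediator is supplied as a function of time, and all function names (`solve`, `least_squares`, `dynamic_path_analysis`) are hypothetical. Restricting each regression to the individuals at risk is equivalent to multiplying the rows of the design matrices by $Y_i(t)$.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular (design not of full rank).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(rows, y):
    """Solve the normal equations L^T L coef = L^T y for the design rows."""
    p = len(rows[0])
    ltl = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    lty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return solve(ltl, lty)


def dynamic_path_analysis(time, event, x1, x2):
    """Cumulative direct and indirect effect estimates at each event time.

    time[i], event[i]: follow-up time and event indicator (1 = event)
    x1[i]: time-fixed treatment; x2[i]: function t -> mediator value used at t
    Returns a list of (event time, cumulative direct, cumulative indirect).
    """
    direct = indirect = 0.0
    out = []
    for t in sorted({ti for ti, d in zip(time, event) if d == 1}):
        risk = [i for i in range(len(time)) if time[i] >= t]
        med = [x2[i](t) for i in risk]
        dn = [1.0 if (time[i] == t and event[i] == 1) else 0.0 for i in risk]
        # mediator model: regress X2 on (1, X1) among those at risk
        b2 = least_squares([[1.0, x1[i]] for i in risk], med)
        # hazard model: regress dN on (1, X1, X2) among those at risk
        db = least_squares([[1.0, x1[i], mi] for i, mi in zip(risk, med)], dn)
        direct += db[1]            # increment of the direct effect
        indirect += b2[1] * db[2]  # product of coefficients along the path
        out.append((t, direct, indirect))
    return out
```

Because both regressions are ordinary least squares fits on the same risk set, the sum of the direct and indirect increments at each event time coincides with the treatment coefficient of the additive model fitted without the mediator, which offers a simple numerical check of an implementation.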


3 Survival collider issue 

In the previous section we have discussed how dynamic path analysis can be viewed as a series of
local DAGs. However, when attempting to connect those DAGs as in Figure 2, one question became
apparent, namely whether survival in itself produces associations between covariates. In our case
that would imply that the association between the treatment and the mediator could partly be due to
survival selection, i.e. an artefact. To clarify, consider Figure 2, where $X_1$ denotes the treatment,
$X_2(s_0)$ the mediator value measured at baseline, say $s_0$, and $dN(t_j)$ the jump of the counting
process at time $t_j$. One could speculate that when estimating the respective effects at event time $t_2$ one
would potentially introduce spurious association between $X_1$ and $X_2(s_0)$ by conditioning on having
survived event time $t_1$, which appears as a collider in the DAG.

Figure 2: Simplest scenario where a survival collider effect could be suspected.

The following argument indicates, however, that this is not a problem: in that case conditioning
on a collider (survival) does not produce artificial association.

We will first look at the case of two covariates that are independent at baseline and show that independence
is preserved under survival assuming the additive hazard model. In a second step, we will
consider covariates that satisfy a linear structural equation model and directly relate that case to the
simple dynamic path analysis model illustrated in Section 2.

3.1 Independence of covariates is preserved under survival in the additive hazard model

Let $X_1$ and $X_2$ here denote two independent covariates measured at baseline. Under a simple additive
hazard model with no interactions, the probability of surviving up to time $t$ is

$$C(t)\exp\left(-\left(\int_0^t \beta_1(s)ds\right)X_1 - \left(\int_0^t \beta_2(s)ds\right)X_2\right),$$

where $C(t)$ results from taking the exponential of the negative integral over the baseline regression function.
Let $T$ denote the survival time. Then

$$P(X_1 = x_1, X_2 = x_2 \mid T > t) = \frac{P(X_1 = x_1, X_2 = x_2, T > t)}{P(T > t)} = \frac{P(T > t \mid X_1 = x_1, X_2 = x_2) f_{X_1}(x_1) f_{X_2}(x_2)}{P(T > t)},$$

where $f_{X_1}(x_1)$ and $f_{X_2}(x_2)$ are the densities of $X_1$ and $X_2$ at time 0. Inserting the survival probability
gives us

$$P(X_1 = x_1, X_2 = x_2 \mid T > t) = \frac{C(t)\exp\left(-\left(\int_0^t \beta_1(s)ds\right)x_1 - \left(\int_0^t \beta_2(s)ds\right)x_2\right) f_{X_1}(x_1) f_{X_2}(x_2)}{P(T > t)}.$$

Hence the probability can still be factorized into a term depending only on $x_1$ and a term depending
only on $x_2$, so $X_1$ and $X_2$ are still independent at time $t$ conditional on survival.

The result can immediately be generalised to an additive model with K independent covariates and 
no interactions. 

3.2 Covariates satisfying a linear structural equation model 

The above statement can be generalized to a very useful result about how the distribution of variables
in a linear structural equation model changes under survival selection. Let us define an ordered
linear structural equation model as follows (modification of Pearl (2009), equation (1.41); see also
Loh and Bühlmann):

$$X_k = b_{k,0} + \sum_{j<k} b_{k,j}X_j + W_k.$$

This is defined for $k = 1, \dots, n$, and all $W_k$ are assumed to be independent. Note that we assume this
model to be structural, hence there are no unmeasured confounders. In matrix form the model can be
written as follows, with the solution in terms of $W$ on the right:

$$X = b_0 + BX + W, \qquad X = (I - B)^{-1}b_0 + (I - B)^{-1}W.$$

Here, $B$ is a strictly lower triangular matrix. Assume that the hazard function is a linear combination
of the $X$'s; then it is also a linear combination of the $W$'s, and hence the independence of the $W$'s is
preserved conditional on survival up to a given time $t$ because of Section 3.1. Let $E_t$ denote conditional
expectation given survival up to time $t$, that is, given $T > t$; then we have

$$E_t\Big[X_k - b_{k,0} - \sum_{j<k} b_{k,j}X_j \,\Big|\, X_j, j<k\Big] = E_t[X_k \mid X_j, j<k] - b_{k,0} - \sum_{j<k} b_{k,j}X_j. \tag{6}$$

Since the independence of the $W$'s is preserved under survival, the left-hand side of the above equation
can be written in the following way:

$$E_t\Big[X_k - b_{k,0} - \sum_{j<k} b_{k,j}X_j \,\Big|\, X_j, j<k\Big] = E_t[W_k \mid W_j, j<k] = E_t[W_k]$$

$$= E_t\Big[X_k - b_{k,0} - \sum_{j<k} b_{k,j}X_j\Big] = E_t[X_k] - b_{k,0} - \sum_{j<k} b_{k,j}E_t[X_j].$$

Comparing with (6), we have:

$$E_t[X_k \mid X_j, j<k] = \sum_{j<k} b_{k,j}X_j + E_t[X_k] - \sum_{j<k} b_{k,j}E_t[X_j].$$

Hence, given survival at time $t$ we have a linear model with the same coefficients that we had at time 0,
that is, there is a stable linear relationship between the variables under survival selection. The constant
term changes, but that does not matter.
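The stability of the coefficients can be checked numerically in a toy example. The sketch below is our own illustration with hypothetical parameter values: a binary $X_1$, an error term $W_2$ with a three-point distribution, and an additive hazard with constant coefficients. It computes $E_t[X_2 \mid X_1 = x]$ exactly and returns the intercept (value at $x = 0$) and the slope in $x$.

```python
import math


def conditional_line(t, b0=5.0, b1=-2.0, a=(0.01, -0.005, 0.02)):
    """Intercept and slope of E[X2 | X1 = x, T > t] for binary X1.

    X2 = b0 + b1 * X1 + W2, hazard = a[0] + a[1] * X1 + a[2] * X2
    (all numerical values are hypothetical and chosen so the hazard stays positive).
    """
    w_dist = {-1.0: 0.25, 0.0: 0.5, 1.0: 0.25}  # distribution of the error W2
    means = []
    for x1 in (0.0, 1.0):
        num = den = 0.0
        for w, pw in w_dist.items():
            x2 = b0 + b1 * x1 + w
            # P(T > t | x1, x2) under the additive hazard model
            surv = math.exp(-t * (a[0] + a[1] * x1 + a[2] * x2))
            num += x2 * surv * pw
            den += surv * pw
        means.append(num / den)
    return means[0], means[1] - means[0]
```

For any $t$ the slope stays at `b1`, while the intercept drifts away from `b0` as survivors with low values of $W_2$ become over-represented, in line with the statement that only the constant term changes.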

3.3 Relation to a simple model for dynamic path analysis 

Let us explicitly point out how the results above relate to dynamic path analysis. Assume that $X_1$
is treatment and $X_2(s_0)$ is the mediator, both measured at time zero. Assume that these covariates
constitute a simple structural equation model, that is

$$X_1 = b_{1,0} + W_1, \qquad X_2(s_0) = b_{2,0}(s_0) + b_{2,1}(s_0)X_1 + W_2(s_0).$$

There are no unmeasured confounders, neither for the relationship between $X_1$ and $X_2(s_0)$ nor for their
influence on survival, which is determined by the additive hazard model:

$$\alpha(t) = \beta_{3,0}(t) + \beta_{3,1}(t)X_1 + \beta_{3,2}(t)X_2(s_0).$$

This is assumed to be a structural model. Given survival at any given time, say $T > t$, the relationship
between $X_1$ and $X_2(s_0)$ is still determined by the coefficient $b_{2,1}(s_0)$. Hence, a simple calculation of
local direct and indirect effects of treatment $X_1$ on hazard would be:


Direct effect: $\beta_{3,1}(t)$

Indirect effect: $b_{2,1}(s_0)\beta_{3,2}(t)$

Note that the same argument would apply locally once we obtain an updated measurement of the
mediator, say $X_2(s_1)$, and consider estimation of the effects around the path plotted as
dashed lines in Figure 3. It could occur that the estimate for the treatment-mediator relationship
$\hat{b}_{2,1}(s_1)$ differs from $\hat{b}_{2,1}(s_0)$. This difference, however, will be due to an actual change in the effect
that can be observed between the treatment and the updated measurement of the mediator, but not
due to any selection bias. Referring once more to our real data application, that would for example
mean that the effect of treatment on LDL-cholesterol levels indeed attenuates after some time.


[Figure 3: diagram including the mediator measurements $X_2(s_0)$ and $X_2(s_1)$.]

Figure 3: Local path of interest for the estimation of direct and indirect effects, when an updated
measurement became available.


3.4 Collapsibility in the additive model 


An important question is whether a factor that influences survival, but is unrelated to other covariates,
can become a confounder over time. Let us assume there is an unmeasured quantity that is not
considered a confounder at time 0, since it is only associated with survival and not with measured
covariates, and that there are no interactions present. It can be a part of the structural equation model
above, but with all the $b$'s equal to zero. It follows from the previous result that the $b$'s will stay
at zero conditioned on survival. Hence, unmeasured confounders do not arise due to survival collider
effects. If the Cox model is valid instead, this will not be true. In fact, any conceivable factor
influencing survival will become a confounder as time passes. The underlying reason for the
different behavior of the two models is the collapsibility of the additive model (as discussed
in Martinussen and Vansteelandt (2013)), which does not hold true for the Cox model.
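The contrast between the two models can be illustrated with a small exact calculation. The sketch below is our own toy example with hypothetical coefficients: a binary treatment $X$ and a binary unmeasured factor $U$ are independent at baseline, and we compute their covariance among survivors at time $t$ under an additive hazard and under a Cox-type (multiplicative) hazard with constant baseline.

```python
import math
from itertools import product


def cov_given_survival(t, surv):
    """Covariance of independent Bernoulli(0.5) covariates X and U given T > t.

    surv(t, x, u) must return P(T > t | X = x, U = u).
    """
    w = {(x, u): 0.25 * surv(t, x, u) for x, u in product((0, 1), repeat=2)}
    total = sum(w.values())
    ex = sum(p * x for (x, u), p in w.items()) / total
    eu = sum(p * u for (x, u), p in w.items()) / total
    exu = sum(p * x * u for (x, u), p in w.items()) / total
    return exu - ex * eu


def additive(t, x, u):
    # additive hazard 0.05 + 0.03 x + 0.04 u (hypothetical values)
    return math.exp(-t * (0.05 + 0.03 * x + 0.04 * u))


def cox(t, x, u):
    # proportional hazard 0.05 * exp(0.5 x + 0.6 u) (hypothetical values)
    return math.exp(-t * 0.05 * math.exp(0.5 * x + 0.6 * u))
```

Calling `cov_given_survival(10, additive)` returns zero up to rounding error, whereas `cov_given_survival(10, cox)` is clearly negative: under the multiplicative model, survival selection induces an association between treatment and a factor that was unrelated to it at baseline.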


4 Simulation study 


To further motivate investigations considering the mediator to be a time-varying process rather than
a time-fixed covariate, we performed a simulation study. We compared dynamic path analysis using a
single mediator measurement to dynamic path analysis using several measurements and also set those
results in relation to the true data generating model. In the following subsections we describe our data
generating strategy for the survival outcome and mediator process and then show results from various
simulation scenarios. All simulations and analyses were performed in R, version 3.0.1 (R Core Team
(2013)).


4.1 Data generation 

To generate data respecting the dynamic path analysis setting as presented in Section 2, with structural
equations

$$\begin{aligned} X_1 &= b_{1,0} + W_1 \\ X_2(s) &= b_{2,0}(s) + b_{2,1}(s)X_1 + W_2(s) \\ \alpha(s) &= \beta_{3,0}(s) + \beta_{3,1}(s)X_1 + \beta_{3,2}(s)X_2(s), \end{aligned}$$

we proceeded in the following way.

Let us first consider the simulation of the time-to-event data. Generally, we assumed a study period of
5 years. This time period was discretised into equidistant grid points, where the distance, $\Delta$, was set
to 1 week. For each interval of size $\Delta$ the probability of an event within that interval can be expressed
as

$$P(t_h < T \leq t_h + \Delta \mid T > t_h) = 1 - \exp\left(-\int_{t_h}^{t_h+\Delta} \alpha(s)ds\right).$$

Making use of a first-order Taylor expansion, this probability can be approximated by $\alpha(t_h)\Delta$ and we can, in
turn, apply the additive form of the hazard $\alpha(t_h)$ including a time-fixed or a time-dependent mediating
variable. Note beforehand that the choice of values in the subsequent paragraph might seem rather
arbitrary at first glance, but the underlying idea was to create a situation that appears quite similar
to the real data example presented in the next section.

To model the flexible regression functions $\beta_{3,j}(s)$, we applied cubic splines and specified the function to
take the values (0.04, 0.03, 0.02, 0.02) at years (0, 1, 3, 5) for the effect of the mediating variable on the
hazard and the values (-0.3, -0.1, -0.6, -0.05, -0.05, -0.05) at years (0, 0.2, 0.8, 1.1, 3.5, 5) for
the effect of treatment. To model a 50:50 randomisation, the treatment variable was simply sampled
from a Bernoulli distribution. For the mediator process we first generated a baseline value, sampled
from a normal distribution with mean 11 and standard deviation 1.5. Again, we assumed a time-varying
effect of treatment on the mediator and modeled the time-dependent coefficient with a cubic
spline function (specified at years (0, 1, 2, 3, 4, 5) to take the values (-0.1, -3, -2.2, -3.3, -2.9, -2.9)). We
added the time-varying treatment component to the sampled baseline values of the mediator,
together with a noise term generated from a multivariate normal distribution with mean vector equal to
zero and a covariance matrix with diagonal elements set to 0.05 and off-diagonal elements set to 0.0.
After generating a treatment indicator, the mediator process and the respective time-varying regression
functions at each discrete time point $t_h$, we could insert the respective values into the function
for the hazard and calculate event probabilities for each individual at each discrete time point. This
enabled us to generate an event history for each individual by sampling from a Bernoulli distribution
with the calculated event probability at each time point. If an individual was sampled to have an event
at one particular time $t_h$, its event indicator was set to one and the survival time to the discrete time
$t_h$ where the event occurred. Censoring times were generated from a uniform distribution to yield,
on average, a 13% censoring rate. Individuals still alive and not censored at year 5 were censored
at that time point due to end of study.
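The event-generation step described above can be sketched as follows. This is a simplified illustration in Python rather than the R code used for the study: the function name `simulate_event_time` is hypothetical, the hazard is passed in as an arbitrary function of time, the uniform censoring mechanism is left out, and the clipping of the event probability to the interval [0, 1] is our own safeguard. The coefficients in the usage lines are made up and are not the spline values given in the text.

```python
import random


def simulate_event_time(hazard, delta=1.0 / 52, horizon=5.0, rng=random):
    """Draw one event history on an equidistant grid.

    hazard: function t -> additive hazard alpha(t) per year for one individual
    delta: grid width in years (one week by default); horizon: study length
    Returns (time, event), with event = 1 if an event occurred before `horizon`.
    """
    n_steps = round(horizon / delta)
    for h in range(n_steps):
        t_h = h * delta
        # first-order approximation alpha(t_h) * delta of the event probability
        # in the interval starting at t_h, clipped to a valid probability
        p_event = min(max(hazard(t_h) * delta, 0.0), 1.0)
        if rng.random() < p_event:
            return t_h, 1
    return horizon, 0  # still event-free: censored at end of study


# Usage with hypothetical constant coefficients
rng = random.Random(1)
x1 = rng.randint(0, 1)            # 50:50 randomisation
baseline = rng.gauss(11.0, 1.5)   # baseline mediator value


def hazard(t):
    x2 = baseline - 2.0 * x1             # made-up treatment effect on the mediator
    return 0.3 - 0.1 * x1 + 0.02 * x2    # made-up additive hazard


time, event = simulate_event_time(hazard, rng=rng)
```

Passing the hazard as a function keeps the grid logic separate from the model for the mediator, so time-varying spline coefficients or repeated mediator measurements could be plugged in without changing the sampling loop.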

4.2 Simulation scenarios 

As stated above, the main objective was to investigate what one could potentially gain from using
more than just one measurement of the mediator, assuming that the mediator is actually a time-varying
process. So a natural scenario to start with was to take just one 'snapshot' measurement
from our generated mediator process and only utilize that measurement in the dynamic path analysis
model. Our flexible simulation procedure, however, allowed us to take 'snapshot' measurements at
various time points and to further make use of them in our model for analysis. For the 'snapshot'
measurements we simply picked values of the simulated mediator process at pre-specified time points
for every individual still in the study at those time points. Again the time points were chosen to mimic
the real data application.

4.3 Simulation results 

In Figures 4 to 6, results based on 100 simulations of trials with sample size 2000 are presented.
The scenarios vary in how many 'snapshot' measurements of the mediator were taken. In all plots the
true curve, displaying the assumed parameters for our simulations, is plotted as a dotted blue line, the
curve obtained from dynamic path analysis only using one mediator measurement is represented as a
red dotted line, whereas the curve from dynamic path analysis using several mediator measurements
is displayed as a black solid line.


[Figure 4: panels 'Total treatment effect', 'Direct treatment effect' and 'Indirect effect of treatment'; legend: DPA: 2 measurements, True mechanism, DPA: 1 measurement.]

Figure 4: Simulation results comparing a dynamic path model (DPA) utilizing only a single measurement
with a model using two measurements and the true model.
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Total treatment effect 



Direct treatment effect 



0 1 2 3 4 

Time 


Indirect effect of treatment 



0 1 2 3 4 

Time 



— DPA: 3 measurements 

- - True mechanism 

DPA: 1 measurement 


Figure 5: Simulation results, comparing a dynamic path model (DPA) only utilizing a single measure¬ 
ment compared to a model using three measurement and the true model. 


[Figure 6 panels: Total treatment effect; Direct treatment effect; Indirect effect of treatment; Effect of treatment on M; Direct effect of M. Legend: DPA: 4 measurements; True mechanism; DPA: 1 measurement.]

Figure 6: Simulation results comparing a dynamic path model (DPA) utilizing only a single measurement
with a model using four measurements and with the true model.

What can be observed throughout all considered scenarios is a tendency to underestimate the
indirect effect and to overestimate the direct effect of treatment on the survival outcome when only a
single measurement is used. However, it also appears to matter at which particular time point
or state of the mediator process the measurements are taken. For example, comparing Figures 4 and
5 one can see that the model using 2 measurements actually gives better results for most of the study
period, because the measurement of the mediator was taken at a time point where the treatment effect
on the mediator roughly equaled the average treatment effect over the remaining study period. A similar
pattern can be observed when comparing Figures 5 and 6. For the study period after year 2, both direct
and indirect effects tend to deviate more from the truth in Figure 6. However, it can also be observed
that having more frequent measurements before year 2 makes it far more likely to capture
the nature of the underlying process.


5 Analysis of the IDEAL data 

In this section we report the results from applying various dynamic path models
to the data collected within the IDEAL trial. Again, we primarily focus on comparing models
that use different numbers of the available mediator measurements.


5.1 The IDEAL study in brief 


Recall that we are dealing with a secondary prevention trial that aimed to compare two different statin
treatment strategies (high dose of atorvastatin and standard dose of simvastatin) to lower cholesterol
levels in patients with previous myocardial infarction. Patients eligible for the study (for a detailed
description of the respective criteria see Pedersen et al. (2004)) were randomized to receive either 20 mg of
simvastatin or 80 mg of atorvastatin, with a foreseen possibility to change dose at week 24. Lipoprotein
levels (low density lipoprotein (LDL) cholesterol, high density lipoprotein (HDL) cholesterol, total
cholesterol (TC), triglycerides (TG), apolipoprotein B-100 (Apo B), apolipoprotein A (Apo A)) from
fasting blood samples, along with liver enzymes and other laboratory measurements, were taken at
baseline, at 12 and 24 weeks, at 1 year and yearly thereafter.

The primary endpoint was defined as time to first occurrence of a major coronary event (CHD).
However, to exploit more of the collected outcome information, we considered one of the broader
secondary endpoints, time to 'any' coronary heart disease event, since, as mentioned above, our
main focus concerns the number of mediator measurements used. (For a more detailed definition of the
endpoints see Pedersen et al. (2005).)


5.2 Patient selection for the present analysis 

Of the 8888 patients originally randomised, 8646 were retained for our analyses after careful
consideration, as we required a certain amount of information to see any mediation effects in our
application of dynamic path analysis. More specifically, we excluded patients who
had no baseline information on the considered mediator (described below) and also those still at
risk at week 12 but with missing mediator values, as those values are needed to actually observe an
effect of treatment. If successive measurements of the time-dependent covariate were not available, the last
observation was carried forward.


5.3 Event frequency and deviations from the assigned treatment regime 

Of the 8646 patients mentioned above, 4350 were originally randomised to 20 mg of
simvastatin and 4296 to 80 mg of atorvastatin; 1019 patients experienced the event of interest in
the simvastatin group and 852 in the atorvastatin group. However, the study design included the option
to adjust a patient's dosage at week 24. As a result of these possible adjustments, a fairly large number
of patients deviated from their allocated treatment at least once during the study period. In the
present paper, we decided to focus on the effects of allocating the described intervention in practice
and therefore present our results from dynamic path analysis applying the intention-to-treat (ITT)
principle, analysing every individual as randomised.

Estimating effects comparing specific treatments at a specific dose would involve a more complicated
analysis due to the special study design and is therefore left for future work.


5.4 Potential Mediators 


The a priori understanding of the underlying mechanisms made LDL-cholesterol the logical candidate
for a mediator. However, due to the ongoing debate in the clinical literature on whether LDL-cholesterol
or apolipoprotein B (Apo B) should be considered the target value for lipid therapy (see e.g.
Ramjee et al. (2011)), we also took Apo B into consideration as a potential mediator. Since Apo B
and LDL-cholesterol are highly correlated and it is still unclear in which way statin treatment affects
Apo B, we refrained from applying a more complex joint model and ran separate analyses for
LDL-cholesterol and Apo B. So as not to distract from the main points we intended to highlight, we only
present the development of the proportion of the effect mediated over time for LDL-cholesterol and
refer to supplementary plots of the actual path effects in the Apo B models.


5.5 Results 


Throughout the IDEAL study, blood lipids were measured at 7 time points after baseline. We
therefore compared models in which all available repeated measurements were used with
a reduced analysis in which we used only the baseline and week 12 measurements of the mediator and
discarded all measurements after week 12. The week 12 measurement was used because, directly at
administration of treatment, one would not see any effect of treatment on the mediator and therefore
no indirect effect.

In all models presented here the square root of LDL-cholesterol [mg/dL] was considered as the
time-dependent mediating variable $X_2(t)$. The models for the treatment-mediator relationship as well as
the treatment-outcome relationship included LDL-cholesterol at baseline (square root transformed),
Apo B level [g/L] at baseline (square root transformed), high density lipoprotein [mg/dL] at baseline
(square root transformed), smoking status, use of statin treatment at randomisation,
presence of hypertension, presence of diabetes, sex and age at baseline, as those covariates
were reported to affect the risk of CHD events and also the LDL-cholesterol levels (Mendis et al. (2011);
[second citation illegible in source] (2002)).
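
To make the structure of such an analysis concrete, here is a minimal Python sketch of a dynamic path analysis with one binary treatment and one time-dependent mediator. It is a simplification under stated assumptions, not the code used for the IDEAL analysis: baseline covariates are omitted, the names `dynamic_path_analysis`, `treat`, `mediator` and `grid` are hypothetical, and the mediator value in use at an event time is taken to be the latest grid value at or before that time. At each event time it fits an ordinary least-squares regression of the mediator on treatment among those at risk and a least-squares additive hazard increment for treatment and mediator, and forms the indirect effect as the product of the two path coefficients.

```python
import numpy as np

def dynamic_path_analysis(time, status, treat, mediator, grid):
    """mediator[i, h] is subject i's mediator value in use from grid[h] onwards."""
    event_times = np.unique(time[status == 1])
    direct, indirect, total = [], [], []
    for t in event_times:
        at_risk = time >= t
        h = max(np.searchsorted(grid, t, side="right") - 1, 0)  # latest column at or before t
        z = treat[at_risk]
        m = mediator[at_risk, h]
        dN = ((time[at_risk] == t) & (status[at_risk] == 1)).astype(float)
        ones = np.ones_like(z)
        # Treatment -> mediator path: ordinary least squares among those at risk.
        psi = np.linalg.lstsq(np.column_stack([ones, z]), m, rcond=None)[0][1]
        # Additive hazard increments for (intercept, treatment, mediator).
        dB = np.linalg.lstsq(np.column_stack([ones, z, m]), dN, rcond=None)[0]
        # Marginal additive model without the mediator: total effect increment.
        dB_total = np.linalg.lstsq(np.column_stack([ones, z]), dN, rcond=None)[0][1]
        direct.append(dB[1])
        indirect.append(psi * dB[2])
        total.append(dB_total)
    return event_times, np.cumsum(direct), np.cumsum(indirect), np.cumsum(total)

# Toy data (all values are arbitrary assumptions for illustration).
rng = np.random.default_rng(0)
n = 400
treat = rng.integers(0, 2, size=n).astype(float)
grid = np.array([0.0, 0.5])
base = 1.0 - 0.5 * treat + rng.normal(scale=0.3, size=n)
mediator = np.column_stack([base, base + rng.normal(scale=0.1, size=n)])
raw = rng.exponential(1.0, size=n)
time = np.minimum(raw, 1.5)
status = (raw < 1.5).astype(int)
times, direct, indirect, total = dynamic_path_analysis(time, status, treat, mediator, grid)
```

Because all regressions at a given event time are least-squares fits on the same risk set, the cumulative total effect equals the sum of the cumulative direct and indirect effects up to rounding error, which gives a convenient sanity check for any implementation.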


The left-hand column of Figure 7 shows the estimated cumulative total, direct and indirect effects of
atorvastatin compared to simvastatin mediated through LDL-cholesterol, using all available
repeated measurements and applying the ITT principle. The right-hand column of Figure 7 shows
the same arrangement of plots for the analysis in which only the baseline and week 12 measurements
of the mediator were used. The week 12 measurement was carried forward for that analysis, mimicking
a situation where one would only have access to one measurement.

Figure 8 displays the corresponding linear regression coefficients and the cumulative direct effect of LDL on
the event outcome for these two models. Generally, the patterns for the effect on LDL
as well as the results for the indirect effect fit the intuitive understanding quite well: statins at a
higher dose have a rather rapid effect on LDL-cholesterol levels after treatment is received, but it takes
some time until these effects actually affect the risk of cardiovascular disease.

Comparing the plots of the treatment effect on LDL-cholesterol in Figure 8, one can clearly see that the
effect of atorvastatin compared to simvastatin on the mediating variable actually attenuates over time.
By utilizing only the baseline and week 12 measurements, however, a constantly greater treatment
effect on the mediator is assumed. Comparing the plots for the indirect effects, one can observe that
the indirect effect appears less pronounced when only the baseline and week 12 measurements are
incorporated in the analysis. This could be due to the general tendency to underestimate the indirect
effect that was also observable in the simulation study. Likewise, the direct effect of treatment behaves
similarly to our simulation results, where we could observe a tendency for overestimation. Attempting to
contribute to the ongoing discussion in the clinical literature, we show the development of the proportion
of effect mediated through LDL-cholesterol and Apo B in different plots in Figure 9. It can be observed
that the proportion of the effect mediated through Apo B has a more pronounced peak within the first
year and also stabilizes at a higher level thereafter.


6 Discussion 

In this paper, we mainly addressed two issues around the model of dynamic path analysis. Most
importantly, we could show that conditioning on the collider 'survival' does not produce artificial
associations between treatment and the mediator. This implies that whenever one finds the effect
of treatment on the mediator to be changing over time, that phenomenon corresponds to
an actual change in the relationship and is not spuriously introduced by survival selection. From our
argument the important notion follows that unmeasured quantities that are not considered to be
confounders but affect survival in an additive manner will not turn into confounders as time passes,
a property that does not hold for the Cox model. This is a consequence of the fact that the additive
hazard model is collapsible, which is not the case for the Cox model (Martinussen and Vansteelandt
(2013)).
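
The collapsibility argument can be illustrated by a short worked derivation. It is a sketch under simplifying assumptions that are not taken from the paper's appendix: a single treatment indicator $Z$, one unmeasured variable $U$ independent of $Z$, and the hypothetical symbols $\gamma(t)$, $B_k(t)$ and $\Gamma(t)$ introduced only for this example.

```latex
% Assumed additive hazard with an unmeasured U that is independent of Z:
\begin{align*}
\alpha(t \mid Z, U) &= \beta_0(t) + \beta_1(t) Z + \gamma(t) U, \qquad U \perp\!\!\!\perp Z, \\
% Survival given (Z, U), with B_k(t) = \int_0^t \beta_k(s)\,ds and \Gamma(t) = \int_0^t \gamma(s)\,ds:
P(T > t \mid Z, U) &= \exp\{-B_0(t) - B_1(t) Z - \Gamma(t) U\}, \\
% Hazard given Z only: average U over the survivors.
\alpha(t \mid Z) &= \beta_0(t) + \beta_1(t) Z + \gamma(t)\, E[U \mid T > t, Z], \\
% The factor \exp\{-B_0(t) - B_1(t) Z\} cancels between numerator and denominator:
E[U \mid T > t, Z] &= \frac{E\big[U e^{-\Gamma(t) U}\big]}{E\big[e^{-\Gamma(t) U}\big]} .
\end{align*}
```

The last expression does not depend on $Z$, so the coefficient of $Z$ in the marginal hazard is still $\beta_1(t)$ and only the baseline changes. In a Cox model the survival function does not factorise in this way, which is why the corresponding average among survivors generally depends on $Z$.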

Other works around the dynamic path analysis model investigated specific confounding scenarios, but
only considered a time-fixed mediator (Martinussen et al. (2011); Lange and Hansen (2011)). In our
data example we, however, assumed no unmeasured confounding between exposure and outcome or
between the mediator and the outcome, and aimed at a causal interpretation. We are aware
that the estimated cumulative direct, indirect and total effects have to be interpreted with caution, as
these assumptions are generally not testable.

Concerning the results from our simulation study and our analyses of the IDEAL data, we mainly want
to emphasise one issue that has, for example, been mentioned in Aalen et al. (2014). That is, when the
underlying mediating mechanism is actually working in time, performing mediation analyses that
incorporate only a single mediator measurement can distort the real picture, and one risks drawing
false conclusions from such analyses.

Looking at our presented plots, a pattern that could be observed throughout was that the bootstrap
confidence intervals for the cumulative indirect effect appeared narrower than those for the direct effects
and consequently for the total effect. This phenomenon could also be observed in the applications
presented in Gamborg et al. (2011). We speculate that this could be due to measurement error in the
assessment of LDL-cholesterol and Apo B, which was reported to be a problem for multi-center studies
(Contois et al. (2011)).


Note that there are several other interesting challenges around the IDEAL data that one could pick for
more in-depth analyses, for example the issue of non-compliance with assigned treatment, which has
already been addressed in Holme et al. (2009). One possible extension would be to employ inverse
probability of censoring weights and then fit a reweighted dynamic path analysis model, as
suggested in Røysland et al. (2011).

From a practical perspective, estimates of effects over time could improve the planning of future trials
with similar objectives. For example, when the mediator is considered as a surrogate endpoint, the respective
plots of the cumulative indirect effect could help to plan the timing of interim measurements.

In summary, dynamic path analysis is a useful tool for mediation analysis with a time-dependent 
mediator and a survival outcome. The obtained cumulative effect plots enable us to describe how 
direct and indirect effects evolve over time, which can add to the mechanistic understanding of the 
underlying processes. The linearity of the models for the treatment-mediator relationship as well as
for the hazard appears to have appealing properties.

[Figure 7 panels, in two columns: Total effect of treatment on CHD; Direct effect of treatment on CHD; Indirect treatment effect through LDL on CHD.]

Figure 7: First column: Estimated cumulative total, direct and indirect effect of atorvastatin compared
to simvastatin mediated through LDL cholesterol using all available measurements; Second column:
Effect estimates only utilizing the week 12 measurement; Grey lines represent the 95% confidence
intervals based on 200 bootstrap samples.

Figure 8: First column: Regression coefficients over time for the effect of atorvastatin compared to
simvastatin on LDL cholesterol and cumulative direct effect of LDL cholesterol on CHD using all
available measurements; Second column: Effect estimates only utilizing the week 12 measurement;
Grey lines represent the 95% confidence intervals based on 200 bootstrap samples.

Figure 9: Proportion of treatment effect mediated through LDL-cholesterol vs. Apo B.












APPENDIX: Causal interpretation in dynamic path analysis 
A The statistical model 

We will consider scenarios that fit the following combined longitudinal and time-to-event model: 

A.1 Covariates

The main event of interest occurs at time $T > 0$, represented by a counting process $N_t$. In addition,
there are variables $X_1, \ldots, X_n$ that can have outcomes before this event occurs. More precisely, we
assume that the outcome of each variable $X_i$ occurs at a deterministic time $t_i$ if the main event has
not already occurred, i.e. if $T > t_i$. We also assume that $i < j$ means that the outcome of $X_j$ cannot
occur strictly before $X_i$. This means that we have deterministic times $t_1 \le \cdots \le t_n$, possibly with
repetitions.

Let $P$ denote the joint density that governs the pre-intervention frequencies of these variables, and
let $b_1, \ldots, b_n$ be functions of the form

\[ b_k(X_1, \ldots, X_{k-1}) = b_{k,0} + b_{k,1} X_1 + \cdots + b_{k,k-1} X_{k-1}. \tag{7} \]

Assume that

\[ X_k - b_k(X_1, \ldots, X_{k-1}) \perp\!\!\!\perp \{X_1, \ldots, X_{k-1}\} \mid T > t_k, \tag{8} \]

and

\[ E[X_k \mid X_1, \ldots, X_{k-1}, T > t_k] = b_k(X_1, \ldots, X_{k-1}), \tag{9} \]

for every $k > 1$ and let

\[ b_{1,0} = E[X_1]. \tag{10} \]

A.2 Covariate processes and additive hazards

The collection of possible events that could have occurred before $t$, the so-called history or filtration
generated by $N_t$ and $\{X_i \mid t_i < t\}$, will be denoted $\mathcal{F}_t$. We will also need to consider the history
restricted to $N_t$ and $\{X_i \mid t_i < t, i \le j\}$, which we will refer to as $\mathcal{F}_t^j$. For mathematical convenience in
the subsequent derivations, let us also introduce the following caglad function, $s \mapsto J_s$ from $[0, \infty)$ into
the $n \times n$-matrices such that the only non-zero entries of $J_s$ are contained in the upper-left
$u_s \times u_s$-corner, where $u_s := \max\{k \mid t_k < s\}$. This gives us a vector-valued 'covariate process'
$s \mapsto J_s X$ that is adapted to the filtration $\mathcal{F}_s$.

Finally, we assume that there exist functions $\beta_s^0$ and $\beta_s := (\beta_s^1, \ldots, \beta_s^n)^T$ such that

\[ \alpha_s = \beta_s^0 + \beta_s^T J_s X \tag{11} \]

defines the hazard of $N_s$ with respect to the pre-intervention joint density $P$ and the filtration $\mathcal{F}_s$.
Let us illustrate these concepts with the following example. Suppose we have a situation with a baseline
variable and a 'process' that takes its first value at baseline, while it takes a new value at a later time,
say $t$. Further, suppose the hazard of $T$, at time $s$, only depends linearly on the baseline variable and
the current value of the process. Now, let $X_1$ denote the variable, $X_2$ denote the baseline value of the
process and $X_3$ the subsequent value of the process. This gives

\[ J_s = \begin{pmatrix} 1 & 0 & 0 \\ 0 & I(s \le t) & 0 \\ 0 & 0 & I(s > t) \end{pmatrix}, \]

so for the covariate process we have

\[ J_s X = \begin{pmatrix} X_1 \\ I(s \le t)\, X_2 \\ I(s > t)\, X_3 \end{pmatrix}, \]

so that the hazard can be written as

\[ \alpha_s = \beta_s^0 + \beta_s^T J_s X = \beta_s^0 + \beta_s^1 X_1 + \beta_s^2 I(s \le t)\, X_2 + \beta_s^3 I(s > t)\, X_3 . \]

A.3 Constant regression coefficients

Due to linearity, we can estimate the coefficients $\{b_{k,j}\}_{j > 0}$ by performing ordinary linear regressions
among the survivors at any time after $t_k$, i.e.

Proposition 1. If we restrict to outcomes such that $T > t > t_k$, then we have

\[ E[X_k \mid \mathcal{F}_t^{k-1}] = b_{k,0} + \sum_{i=1}^{k-1} b_{k,i} X_i + E[W_k \mid T > t], \tag{12} \]

where $W_k := X_k - b_k(X_1, \ldots, X_{k-1})$.


B Causal inference 


B.1 Causal validity

We will say that the model is causal if, for every $\nu = (\nu_1, \ldots, \nu_n)$, we can specify interventions that
would have given joint densities $\tilde{P}$ such that (8) and (9) hold when each function $b_k$ is replaced by
$b_k + \nu_k$, while $\alpha$ still defines a hazard with respect to $\tilde{P}$. If the model is causal, we are able to calculate
the causal effects from these interventions on the marginal hazard of $N$.

Theorem 1. Let $\alpha_t^0$ and $\tilde{\alpha}_t^0$ denote the marginal hazards of $N_t$ for $P$ and $\tilde{P}$ respectively. We have
that

\[ \tilde{\alpha}_t^0 = \alpha_t^0 + \sum_{i=1}^{n} \Big( (\beta_t^T J_t)_i \, \nu_i + \sum_{l > i} \sum_{p} (\beta_t^T J_t)_l \, b_{i_k, i_{k-1}} \cdots b_{i_2, i_1} \, \nu_i \Big), \tag{13} \]

where we sum over partitions $p = (i_1 = i < i_2 < \cdots < i_{k-1} < i_k = l)$, which correspond to the paths
defined in Section 2.5 of Fosen et al. (2006a).
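
The path expansion in Theorem 1 can be checked numerically against the matrix form used in its proof. The Python sketch below is only an illustration: the helper names `product_form`, `path_sum`, `hazard_shift_matrix` and `hazard_shift_paths` are hypothetical, the matrix `B` collects the coefficients $b_{k,j}$ below the diagonal, and the vector `w` stands in for $\beta_t^T J_t$ at a fixed time.

```python
import itertools
import numpy as np

def product_form(B):
    """(I + V_n) ... (I + V_2), where V_i keeps only row i of B (entries left of the diagonal)."""
    n = B.shape[0]
    out = np.eye(n)
    for i in range(1, n):                 # 0-based rows 1..n-1 correspond to V_2..V_n
        V = np.zeros((n, n))
        V[i, :i] = B[i, :i]
        out = (np.eye(n) + V) @ out       # left-multiply so that V_n ends up leftmost
    return out

def path_sum(B, i, l):
    """Sum over increasing paths i = i_1 < ... < i_k = l of b_{i_k,i_{k-1}} ... b_{i_2,i_1}."""
    if i == l:
        return 1.0
    total = 0.0
    between = range(i + 1, l)
    for r in range(len(between) + 1):
        for mid in itertools.combinations(between, r):
            nodes = (i, *mid, l)
            total += np.prod([B[b, a] for a, b in zip(nodes, nodes[1:])])
    return total

def hazard_shift_matrix(w, B, nu):
    """Change in the marginal hazard computed from the matrix product."""
    return w @ product_form(B) @ nu

def hazard_shift_paths(w, B, nu):
    """The same change written as a sum over paths."""
    n = len(nu)
    return sum(w[l] * path_sum(B, i, l) * nu[i] for i in range(n) for l in range(i, n))

rng = np.random.default_rng(0)
n = 5
B = np.tril(rng.normal(size=(n, n)), k=-1)   # arbitrary coefficients b_{k,j}, j < k
w = rng.normal(size=n)                       # stands in for beta_t^T J_t
nu = rng.normal(size=n)
shift_matrix = hazard_shift_matrix(w, B, nu)
shift_paths = hazard_shift_paths(w, B, nu)
```

The two computations agree because the matrix product equals the inverse of $I - B$, whose entries are exactly the sums of products of coefficients along increasing paths.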


B.2 Mediation

Suppose we could perform an intervention that would only perturb the component $X_i$, i.e. we obtain
a joint density $\tilde{P}$, where only the function $b_i$ is replaced by $b_i + \epsilon$.

We see from (13) that the effect seems to propagate through the system. To isolate the effect that
is mediated through a variable $X_j$, where $j > i$, we could instead manipulate the system such that $X_j$
behaved as if $X_i$ had been perturbed, while we let $X_i$ remain unperturbed.

Note that we have

\[ b_j(X_1, \ldots, X_{i-1}, X_i + \epsilon, X_{i+1}, \ldots, X_{j-1}) = b_j(X_1, \ldots, X_{j-1}) + b_{j,i} \cdot \epsilon . \]

This means that if the model is causal, then we could intervene such that $b_j(X_1, \ldots, X_{j-1})$ is replaced
by $b_j(X_1, \ldots, X_{j-1}) + b_{j,i} \cdot \epsilon$ to obtain a scenario as we just described. Therefore, by Theorem 1 the
mediated effect on the marginal hazard of $N$ is given by

\[ (\beta_t^T J_t)_j \, b_{j,i} \, \epsilon + \sum_{l > j} \sum_{p} (\beta_t^T J_t)_l \, b_{i_k, i_{k-1}} \cdots b_{i_2, i_1} \, b_{j,i} \, \epsilon, \tag{14} \]

where we sum over partitions $p$ such that $p = (i_1 = j < i_2 < \cdots < i_{k-1} < i_k = l)$. This resembles the
classical definition of natural indirect effects that builds on nested counterfactuals, see Pearl (2009),
Robins and Greenland (1992) and Lange and Hansen (2011). Note that we are able to define indirect
effects without explicit use of nested counterfactuals by exploiting the linearity. It is an interesting
observation that this, at least approximately, gives meaning for non-linear models where $b_1, \ldots, b_n$ are
replaced by smooth functions, since smooth functions behave as if they are affine for small $\epsilon$.

B.3 Causal pathways and path-effects

Suppose we are given a pathway:

\[ X_{i_1} \to X_{i_2} \to \cdots \to X_{i_k} \to N_t, \tag{15} \]

where $i_j < i_{j+1}$. If the model is causal with respect to all the nodes $X_{i_1}, \ldots, X_{i_k}$ then we can isolate
the effect that flows only through this path by first isolating the effect that is mediated by $X_{i_k}$, and
then iterate this contrast for each $i_j$ such that $1 < j < k$. This gives the following path-effect:

\[ (\beta_t^T J_t)_{i_k} \, b_{i_k, i_{k-1}} \cdots b_{i_2, i_1} \cdot \epsilon . \tag{16} \]

By Proposition 1 the path-effect along (15), as defined in the dynamic path analysis (Fosen et al.
(2006b)), is simply the integral of (16) in the special case $\epsilon = 1$.


B.4 Direct effects

To isolate the effect from $X_i$ that is not mediated through the other covariates, we have to attenuate
any other effect the intervention $X_i \to X_i + \epsilon$ would have had. For $i < j$, we want to change each
function $b_j$ such that $X_j$ would behave as if $X_i$ was not perturbed, even if it was. Note that

\[ b_j(X_1, \ldots, X_{i-1}, X_i - \epsilon, X_{i+1}, \ldots, X_{j-1}) = b_j(X_1, \ldots, X_{j-1}) - b_{j,i}\,\epsilon . \tag{17} \]

This means that if we perturbed each function $b_j$ into $b_j - b_{j,i}\,\epsilon$ while we also perturbed $X_i$ such that
$b_i$ is replaced by $b_i + \epsilon$, then $\tilde{\alpha}_t^0 - \alpha_t^0$ represents the direct effect. The sum (13) telescopes for

\[ \nu = (0, \ldots, 0, \epsilon, -b_{i+1,i}\,\epsilon, \ldots, -b_{n,i}\,\epsilon), \]

so the direct effect equals

\[ \tilde{\alpha}_t^0 - \alpha_t^0 = (\beta_t^T J_t)_i \cdot \epsilon . \tag{18} \]

Note that the direct effect, as defined in the dynamic path analysis (Fosen et al. (2006b)), is simply
the integral of (18) in the special case $\epsilon = 1$.
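
As a small worked illustration of the telescoping, consider the case $n = 3$ and $i = 1$. The shorthand $w_l := (\beta_t^T J_t)_l$ is introduced only for this example; the terms are collected according to (13) with $\nu = (\epsilon, -b_{2,1}\epsilon, -b_{3,1}\epsilon)$.

```latex
\begin{align*}
% contribution of \nu_1 = \epsilon: direct term plus all paths starting in X_1
&\; w_1 \epsilon + w_2 b_{2,1} \epsilon + w_3 (b_{3,1} + b_{3,2} b_{2,1}) \epsilon \\
% contribution of \nu_2 = -b_{2,1}\epsilon: direct term plus the path X_2 -> X_3
&+ w_2 (-b_{2,1} \epsilon) + w_3 b_{3,2} (-b_{2,1} \epsilon) \\
% contribution of \nu_3 = -b_{3,1}\epsilon: direct term only
&+ w_3 (-b_{3,1} \epsilon) \\
&= w_1 \epsilon .
\end{align*}
```

All terms involving $w_2$ and $w_3$ cancel, leaving $(\beta_t^T J_t)_1 \cdot \epsilon$, in agreement with (18).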


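Proofs

Proof of Theorem 1. Let $V_i$ denote the $n \times n$-dimensional matrix where the $i$'th row equals
$(b_{i,1} \; \ldots \; b_{i,i-1} \; 0 \; \ldots \; 0)$ and all the remaining entries are 0. Moreover, let

\[ \phi_t(x_1, \ldots, x_n) := \beta_t^T J_t (I + V_n) \cdots (I + V_2) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \tag{19} \]

and note that

\[ \phi_t(x_1, \ldots, x_n) = \phi_t(x_1 I(t > t_1), \ldots, x_n I(t > t_n)). \]

Lemma 1. Each $W_j$ is independent of $W_1, \ldots, W_{j-1}$ conditionally on $T > t > t_j$ and

\[ \phi_t(W_1, \ldots, W_j, E[W_{j+1} \mid T > t], \ldots, E[W_n \mid T > t]) \tag{20} \]

defines the hazard of $T$ with respect to $\mathcal{F}_t^j$ with respect to both $P$ and $\tilde{P}$.

Proof. For every continuous function $h$,

\begin{align*}
E[h(W_n) \mid \mathcal{F}_t^{n-1} \cap \{T > t\}]
&= \frac{E[h(W_n) \exp(-\int_{t_n}^{t} \phi_s(W_1, \ldots, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]}
        {E[\exp(-\int_{t_n}^{t} \phi_s(W_1, \ldots, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]} \\
&= \frac{E[h(W_n) \exp(-\int_{t_n}^{t} \phi_s(W_1, \ldots, W_{n-1}, 0) + \phi_s(0, \ldots, 0, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]}
        {E[\exp(-\int_{t_n}^{t} \phi_s(W_1, \ldots, W_{n-1}, 0) + \phi_s(0, \ldots, 0, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]} \\
&= \frac{E[h(W_n) \exp(-\int_{t_n}^{t} \phi_s(0, \ldots, 0, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]}
        {E[\exp(-\int_{t_n}^{t} \phi_s(0, \ldots, 0, W_n)\,ds) \mid \mathcal{F}_{t_n}^{n-1} \cap \{T > t_n\}]} \\
&= \frac{E[h(W_n) \exp(-\int_{t_n}^{t} \phi_s(0, \ldots, 0, W_n)\,ds) \mid T > t_n]}
        {E[\exp(-\int_{t_n}^{t} \phi_s(0, \ldots, 0, W_n)\,ds) \mid T > t_n]} \\
&= E[h(W_n) \mid T > t].
\end{align*}

Suppose that the independence claim is true for $n, n-1, \ldots, j+1$. The innovation theorem implies
that (20) gives the hazard with respect to $\mathcal{F}_t^j$.

Furthermore, we have that

\begin{align*}
E[h(W_j) \mid \mathcal{F}_t^{j-1} \cap \{T > t\}]
&= \frac{E[h(W_j) \exp(-\int_{t_j}^{t} \phi_s(W_1, \ldots, W_j, E[W_{j+1} \mid T > s], \ldots, E[W_n \mid T > s])\,ds) \mid \mathcal{F}_{t_j}^{j-1} \cap \{T > t_j\}]}
        {E[\exp(-\int_{t_j}^{t} \phi_s(W_1, \ldots, W_j, E[W_{j+1} \mid T > s], \ldots, E[W_n \mid T > s])\,ds) \mid \mathcal{F}_{t_j}^{j-1} \cap \{T > t_j\}]} \\
&= \frac{E[h(W_j) \exp(-\int_{t_j}^{t} \phi_s(0, \ldots, 0, W_j, 0, \ldots, 0)\,ds) \mid \mathcal{F}_{t_j}^{j-1} \cap \{T > t_j\}]}
        {E[\exp(-\int_{t_j}^{t} \phi_s(0, \ldots, 0, W_j, 0, \ldots, 0)\,ds) \mid \mathcal{F}_{t_j}^{j-1} \cap \{T > t_j\}]} \\
&= \frac{E[h(W_j) \exp(-\int_{t_j}^{t} \phi_s(0, \ldots, 0, W_j, 0, \ldots, 0)\,ds) \mid T > t_j]}
        {E[\exp(-\int_{t_j}^{t} \phi_s(0, \ldots, 0, W_j, 0, \ldots, 0)\,ds) \mid T > t_j]} \\
&= E[h(W_j) \mid T > t].
\end{align*}

So the independence claim is also true for $j$. The result therefore follows by induction. □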

Lemma 2. 

Ep[hiWi)\T >t] = Ep[h{W, + V,)\T > t] (21) 

for every i such that t > U. 

Note that for convenience of notation in the following proof we introduce the following notation 
Wk{s) := Ep[Wk\T > s] and Wk{s) := Ep[Wk\T > s] 
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Proof. We have that 

Ep[h(Wn)\T > t] 

=Ep[h{Wn)\EfP^n{T>t}] 

_Ep[h{Wn) exp ( - Jl MWi ,..., Wn)ds)n{T> t„}] 

Sp[exp ( - Jl MWi,..., Wn)ds) IJ-”-' n {T > tn}] 

_ Ep[h{W^ + W) exp ( - .., W„-i, W„ + Vn)ds) | n {T > t^}] 

Ep [exp ( - (f,{Wl,...,Wn-uWn + Vn)ds) r^{T> tn}] 

_Ep[h{W^ + 14) exp ( - Wn)ds) n{T> 4}] 

~ Ep[exp ( - Jl MWi,..., Wn)ds) n{T> 4}] 

=Ep[h{Wn + n{T> t}] 

=4p[h(w„ + i4)|r>t]. 

Suppose that (1^ is satisfied for z = n, n — 1,..., j + 1. Now 

Ep[h{W,)\T > t] 

=Ep[h{w,)\Ei:^n{T>t}] 

Ep[h{W,) exp ( - MWi ,.. ■, Wr,)ds) n{T> t,}] 

Ep[exp ( - /;. MWi ,..., Wn)ds) n{T> tj}] 

Ep[h{W,) exp ( - /;. MWi ,..., WpW,+iis),Wnis))ds) | n {T > t,}] 

Eplexp ( - /;. cfsiWu. ■., W,, 14,+i(s),..., Wn{s))ds) | n {T > t,}] 

Ep[h{Wj) exp ( - /;. MWi ,..., WpWj+iis) + V,+u W„(s) + V„)ds) n {T > 4}] 

4p[exp ( - /;. , 14„ W,+i (s) + y,+i,..., W„(s) + Vn)ds) n {T > 4}]] 

4p[h(W, + V,) exp ( - /;^ ..., M", + V,, W,+As) + ^,+1, ■ • •, W^is) + V^)ds) \Fi-^ n {T > 4}] 

4p[exp ( - //. , 14,-1, W, + 4,, 14,+i(s) + 4,+i,..., 14„(s) + 4„)ds) |n {T > 4}] 

4p[/i(14,- + 4,) exp ( - 0,(14i,..., 14,, t4,+i(s),..., 14„(s)) + ..., 0, 4„ ..., 4„)ds) | n {T > 4}] 

” 4p[exp ( - /;. WpWj+i (s),..., 14„(s)) + ..., 0, 4„ ..., 4„)ds) | n {T > 4}] 

4p[/i(14, + 4,) exp ( - /;. </.,(14i,..., 14„ t4,+i(s),..., 14„(s)))ds) | n {T > 4}]] 

” Ep[exp ( - /;. 0,(141 ,..., 14,, 14,+i(s),..., 14„(s))ds) |n {T > 4}] 

=4p[/i(14,+4,)|^r'n{T>t}] 

=£;p[/i(14,+4,)|r>t], 

so (1^ follows by induction. □ 

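Let $\alpha^0$ denote the hazard of $T$ with respect to $N_t$ and $P$, and let $\tilde{\alpha}^0$ denote the hazard we would
see if the frequencies had been governed by $\tilde{P}$.

Now,

\begin{align*}
E_{\tilde{P}}[\beta_t^T J_t X \mid T > t]
&= \phi_t(E_{\tilde{P}}[W_1 \mid T > t], \ldots, E_{\tilde{P}}[W_n \mid T > t]) \\
&= \phi_t(E_P[W_1 \mid T > t] + \nu_1, \ldots, E_P[W_n \mid T > t] + \nu_n) \\
&= \phi_t(E_P[W_1 \mid T > t], \ldots, E_P[W_n \mid T > t]) + \phi_t(\nu_1, \ldots, \nu_n).
\end{align*}

By the Innovation Theorem, we have the following:

\[ \tilde{\alpha}_t^0 = \alpha_t^0 + \beta_t^T J_t (I + V_n)(I + V_{n-1}) \cdots (I + V_2)\, \nu . \tag{22} \]

Finally, (13) follows by writing out the matrix products. □

Proof of Proposition 1.

\begin{align*}
E[X_k \mid \mathcal{F}_t^{k-1}]
&= E\Big[ b_{k,0} + \sum_{j=1}^{k-1} b_{k,j} X_j + W_k \,\Big|\, \mathcal{F}_t^{k-1} \Big]
 = b_{k,0} + \sum_{j=1}^{k-1} b_{k,j} X_j + E[W_k \mid \mathcal{F}_t^{k-1}] \\
&= b_{k,0} + \sum_{j=1}^{k-1} b_{k,j} X_j + E[W_k \mid T > t],
\end{align*}

where the last equality follows since $W_k \perp\!\!\!\perp W_1, \ldots, W_{k-1} \mid T > t$ by Lemma 1. □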